 Minneapolis


RefLoRA: Refactored Low-Rank Adaptation for Efficient Fine-Tuning of Large Models

Neural Information Processing Systems

Low-Rank Adaptation (LoRA) lowers the computational and memory overhead of fine-tuning large models by updating a low-dimensional subspace of the pretrained weight matrix. Albeit efficient, LoRA exhibits suboptimal convergence and noticeable performance degradation, due to inconsistent and imbalanced weight updates induced by its nonunique low-rank factorizations. To overcome these limitations, this article identifies the optimal low-rank factorization per step that minimizes an upper bound on the loss. The resultant refactored low-rank adaptation (RefLoRA) method promotes a flatter loss landscape, along with consistent and balanced weight updates, thus speeding up stable convergence. Extensive experiments evaluate RefLoRA on natural language understanding and commonsense reasoning tasks with popular large language models including DeBERTaV3, LLaMA-7B, LLaMA2-7B and LLaMA3-8B. The numerical tests corroborate that RefLoRA converges faster, outperforms various benchmarks, and enjoys negligible computational overhead compared to state-of-the-art LoRA variants.
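
The nonuniqueness mentioned in the abstract can be illustrated with a small numerical sketch. This is not the RefLoRA method itself; it only shows, under assumed toy shapes and a random stand-in gradient, that two factorizations of the same low-rank product lead to different weight updates after one gradient step. The names `A`, `B`, `P`, `G` and `weight_update` are hypothetical and chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 6, 5, 2                      # assumed toy layer shape and rank

A = rng.standard_normal((m, r))        # low-rank factors; the adapter is A @ B
B = rng.standard_normal((r, n))
G = rng.standard_normal((m, n))        # stand-in for dLoss/dW at the current weights


def weight_update(A, B, G, lr=0.1):
    """Change in the product A @ B after one plain gradient step on A and B."""
    grad_A = G @ B.T                   # chain rule: dL/dA = G B^T
    grad_B = A.T @ G                   # chain rule: dL/dB = A^T G
    A_new = A - lr * grad_A
    B_new = B - lr * grad_B
    return A_new @ B_new - A @ B


# Any invertible r x r matrix P gives an equivalent factorization (A P, P^{-1} B).
P = np.array([[3.0, 0.0], [0.0, 0.5]])
A2, B2 = A @ P, np.linalg.inv(P) @ B

same_product = np.allclose(A @ B, A2 @ B2)
same_update = np.allclose(weight_update(A, B, G), weight_update(A2, B2, G))
print("same product:", same_product)
print("same update: ", same_update)
```

Both factorizations represent the same adapted weight, yet the step taken in weight space depends on which one is used, which is the kind of inconsistency the abstract attributes to plain LoRA.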


A General-Purpose Theorem for High-Probability Bounds of Stochastic Approximation with Polyak Averaging

Neural Information Processing Systems

Polyak-Ruppert averaging is a widely used technique to achieve the optimal asymptotic variance of stochastic approximation (SA) algorithms, yet its high-probability performance guarantees remain underexplored in general settings. In this paper, we present a general framework for establishing non-asymptotic concentration bounds for the error of averaged SA iterates. Our approach assumes access to individual concentration bounds for the unaveraged iterates and yields a sharp bound on the averaged iterates. We also construct an example, showing the tightness of our result up to constant multiplicative factors. As direct applications, we derive tight concentration bounds for contractive SA algorithms and for algorithms such as temporal difference learning and Q-learning with averaging, obtaining new bounds in settings where traditional analysis is challenging.
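
As a rough illustration of Polyak-Ruppert averaging (not of the paper's theorem or its bounds), the sketch below runs a scalar stochastic approximation recursion that estimates a mean from noisy samples and keeps a running average of the iterates. The target value, the step-size exponent 0.7, the number of steps and the function name `averaged_sa` are all assumptions made for this example.

```python
import numpy as np


def averaged_sa(target=2.0, n_steps=20_000, seed=0):
    """Run x_{k+1} = x_k + a_k (y_k - x_k) and average the iterates."""
    rng = np.random.default_rng(seed)
    x = 0.0                 # unaveraged SA iterate
    running_avg = 0.0       # Polyak-Ruppert average of the iterates so far
    for k in range(1, n_steps + 1):
        y = target + rng.standard_normal()   # noisy observation of the target
        step = k ** -0.7                      # assumed slowly decaying step size
        x += step * (y - x)
        running_avg += (x - running_avg) / k  # incremental mean of x_1..x_k
    return x, running_avg


last_iterate, averaged_iterate = averaged_sa()
print("last iterate error:    ", abs(last_iterate - 2.0))
print("averaged iterate error:", abs(averaged_iterate - 2.0))
```

The average is computed incrementally, so no iterate history needs to be stored; the abstract's concentration bounds concern how tightly such an averaged iterate clusters around the solution.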


Will this high-tech lounge change how you wait at airports?

FOX News

This material may not be published, broadcast, rewritten, or redistributed. Quotes displayed in real-time or delayed by at least 15 minutes. Market data provided by FactSet. Powered and implemented by FactSet Digital Solutions. Mutual Fund and ETF data provided by LSEG.


Will Ken Paxton Hand Democrats a Texas Senate Seat?

Slate

Paxton trounces Cornyn in the Texas Senate Republican primary runoff; Trump waffles between a losing "peace deal" and a return to war in Iran; and congressional candidate Alex Bores makes the case for AI regulation.


Amazon rolls out its new 30-minute delivery option in a number of cities across the US

Engadget

Amazon is rolling out its ultra-fast delivery service, Amazon Now, in dozens of cities in the US, promising deliveries of groceries and household essentials in 30 minutes or less. Amazon says the service is also now widely available in Atlanta and Dallas-Fort Worth, and will rapidly expand into Austin, Houston, Minneapolis, Orlando, Phoenix, Denver, Oklahoma City and more throughout the rest of 2026. If Amazon Now is available in your area, you'll see a 30-Minute Delivery option in the Amazon app or on the homepage when you're in a browser. Amazon Now offers will also be highlighted when you're browsing products. You can search by category, and in addition to groceries and basic household items such as eggs, dairy and laundry detergent, you can order select electronics on the service, which Amazon says operates 24 hours a day in most places.


May Day rallies sweep US, demanding reforms for working-class rights

Al Jazeera

Roughly 500 labour groups across the United States have organised a widespread economic blackout calling for "no school, no work, no shopping" to mark May Day, also known as International Workers' Day. The events, organised as part of an initiative called May Day Strong, were inspired by economic boycotts following ramped-up immigration enforcement operations in Minneapolis, Minnesota, and the deaths of US citizens Renee Good and Alex Pretti in January. May Day Strong has a broad set of demands, including "tax the rich" and abolishing Immigration and Customs Enforcement (ICE) -- a call that comes as Republicans voted on Wednesday on a budgetary measure that would fund the agency under the Department of Homeland Security. It also calls for ending war and "expanding democracy", according to a statement from the group. While the tent is broad in nature, organisers stressed that it is a result of a wide set of challenges facing the US worker.


MLB pitcher's wife dealt cancer blow ahead of birth of their first child

FOX News



Causality-Encoded Diffusion Models for Interventional Sampling and Edge Inference

arXiv.org Machine Learning

Diffusion models [1, 2, 3] have emerged as a powerful class of generative models, achieving state-of-the-art performance across a wide range of applications, including imaging [2] and scientific-data synthesis [4]. From a statistical perspective, they can be viewed as flexible nonparametric estimators of a (conditional) distribution via score estimation and reverse-time stochastic differential equations (SDEs) [5, 6]. Despite this expressive power, standard diffusion models are typically causality-agnostic: they learn a joint law without encoding the directional asymmetries required for causal interpretation. As a consequence, they do not, on their own, provide principled answers to interventional queries or support broader causal analyses, which are central to structural causal models (SCMs) [7]. When a causal ordering (or a directed acyclic graph) is available, it is natural to construct generative procedures that sample variables sequentially according to the causal factorisation. Such iterative, ordering-respecting approaches have been proposed using a variety of generative models, including generative adversarial networks [8], variational autoencoders [9], normalising flows [10], and diffusion-based constructions such as DDIM [11]. However, a rigorous statistical understanding of the advantages of exploiting such causal structure and the inferential use of the resulting generator remains less developed.


Palantir Employees Are Starting to Wonder if They're the Bad Guys

WIRED

Interviews with current and former Palantir employees, along with internal Slack messages obtained by WIRED, suggest a workforce in turmoil. It took just a few months of President Donald Trump's second term for Palantir employees to question their company's commitments to civil liberties. Last fall, Palantir seemed to become the technological backbone of Trump's immigration enforcement machinery, providing software identifying, tracking, and helping deport immigrants on behalf of the Department of Homeland Security (DHS), when current and former employees started ringing the alarm. Right as they picked up the call, one of them asked, "Are you tracking Palantir's descent into fascism?" "That was their greeting," the other former employee says.


Anchor-Free Correlated Topic Modeling: Identifiability and Algorithm

Neural Information Processing Systems

In topic modeling, many algorithms that guarantee identifiability of the topics have been developed under the premise that there exist anchor words, i.e., words that only appear (with positive probability) in one topic. Follow-up work has resorted to third- or higher-order statistics of the data corpus to relax the anchor word assumption. Reliable estimates of higher-order statistics are hard to obtain, however, and the identification of topics under those models hinges on uncorrelatedness of the topics, which can be unrealistic. This paper revisits topic modeling based on second-order moments, and proposes an anchor-free topic mining framework. The proposed approach guarantees the identification of the topics under a much milder condition compared to the anchor-word assumption, thereby exhibiting much better robustness in practice. The associated algorithm only involves one eigendecomposition and a few small linear programs. This makes it easy to implement and scale up to very large problem instances. Experiments using the TDT2 and Reuters-21578 corpora demonstrate that the proposed anchor-free approach exhibits very favorable performance (measured using coherence, similarity count, and clustering accuracy metrics) compared to the prior art.